On sumsets and spectral gaps 

Ernie Croot* 
Georgia Tech 
School of Mathematics 
103 Skiles 
Atlanta, GA 30332 

Tomasz Schoen^ 
Department of Discrete Mathematics 
Adam Mickiewicz University
ul. Umultowska 87, 61-614 Poznań, Poland

February 1, 2008 



Abstract 

Suppose that $S \subseteq \mathbb{F}_p$, where $p$ is a prime number. Let
$\lambda_1, \dots, \lambda_p$ be the Fourier coefficients of $S$ arranged as follows:

$$|\hat{S}(0)| = |\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_p|.$$

Then, as is well known, the smaller $|\lambda_2|$ is relative to $|\lambda_1|$, the larger the
sumset $S + S$ must be; and one can work out, as a function of $\varepsilon$ and the density
$\theta = |S|/p$, an upper bound for the ratio $|\lambda_2|/|\lambda_1|$ needed in order to guarantee
that $S + S$ covers at least $(1 - \varepsilon)p$ residue classes modulo $p$. Put another way,
if $S$ has a large spectral gap, then most elements of $\mathbb{F}_p$ have the same number
of representations as a sum of two elements of $S$, thereby making $S + S$ large.

What we show in this paper is an extension of this fact, which holds for
spectral gaps between other consecutive Fourier coefficients $\lambda_k, \lambda_{k+1}$, so long
as $k$ is not too large; in particular, our theorem will work so long as

$$1 \le k < \frac{\log p}{\log 4}.$$

Furthermore, we develop results for repeated sums $S + S + \cdots + S$.

It is worth noting that this phenomenon does not hold in the larger finite
field setting $\mathbb{F}_p^n$ for fixed $p$, where we let $n \to \infty$, because, for example,
the indicator function of a large subspace of $\mathbb{F}_p^n$ can have a large spectral
gap, and yet the sumset of that subspace with itself equals the subspace (which
therefore means it cannot cover a $1 - \varepsilon$ fraction of $\mathbb{F}_p^n$). The property of
$\mathbb{F}_p$ that we exploit, which does not hold for $\mathbb{F}_p^n$ (at least not in the way that
we would like; Browkin, Diviš and Schinzel [1] have analyzed the problem for
more general settings than just $\mathbb{F}_p$), is something we call a "unique differences"
property, first identified by W. Feit, with first proofs and basic results found
by Straus [4].

1 Introduction 

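Suppose that

$$f : \mathbb{F}_p \to [0,1],$$

and let

$$\theta := \mathbb{E}(f) := p^{-1} \sum_n f(n).$$

For an $a \in \mathbb{F}_p$, define the usual Fourier transform

$$\hat{f}(a) := \sum_n f(n) e^{2\pi i a n/p}.$$

We order the elements of $\mathbb{F}_p$ as

$$a_1, \dots, a_p,$$

so that

$$|\hat{f}(a_1)| \ge |\hat{f}(a_2)| \ge \cdots \ge |\hat{f}(a_p)| \qquad (1)$$

(there may be multiple choices for $a_1, \dots, a_p$; any ordering will do) and, for
convenience, we set

$$\lambda_i = \hat{f}(a_i), \quad i = 1, \dots, p.$$

Note, then, that

$$|\lambda_1| = |\hat{f}(0)|.$$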
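The definitions above can be checked numerically on a small example. The following is a minimal sketch, not part of the argument: the helper names `fourier_transform` and `ordered_spectrum` are hypothetical, and the choice of the nonzero squares modulo 13 as the set is only an illustration.

```python
import cmath

def fourier_transform(f, p):
    """Return [hat f(0), ..., hat f(p-1)], hat f(a) = sum_n f(n) e^{2 pi i a n / p}."""
    return [
        sum(f[n] * cmath.exp(2j * cmath.pi * a * n / p) for n in range(p))
        for a in range(p)
    ]

def ordered_spectrum(f, p):
    """Return (a_1..a_p, lambda_1..lambda_p), sorted by decreasing |hat f(a)|."""
    fhat = fourier_transform(f, p)
    order = sorted(range(p), key=lambda a: -abs(fhat[a]))
    return order, [fhat[a] for a in order]

# Illustrative input (an assumption): indicator of the nonzero squares modulo 13.
p = 13
S = {(x * x) % p for x in range(1, p)}
f = [1.0 if n in S else 0.0 for n in range(p)]

theta = sum(f) / p                    # density E(f)
order, lam = ordered_spectrum(f, p)   # lam[0] plays the role of lambda_1
print(order[0], abs(lam[0]), p * theta)
```

Here the largest coefficient sits at the frequency 0 and has absolute value $p\theta$, in line with the note above.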

In this paper we prove the following basic theorem. 

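Theorem 1 Suppose that $f : \mathbb{F}_p \to [0,1]$, $f$ not identically 0, has the property that
for some

$$1 \le k < \frac{\log p}{\log 4}$$

we have that

$$|\lambda_{k+1}| \le \gamma |\lambda_k|.$$

Then,

$$|\{n \in \mathbb{F}_p : (f * f)(n) > 0\}| \ge p\left(1 - 2\theta p^2 \gamma^2 |\lambda_k|^{-2}\right).$$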

Remark. It is easy to construct functions $f$ which have a large spectral gap as in
the hypotheses. For example, take $f$ to be the function whose Fourier transform
satisfies $\hat{f}(0) = p/2$, $\hat{f}(1) = \hat{f}(-1) = p/4$, and $\hat{f}(a) = 0$ for $a \ne 0, \pm 1$. Clearly
we have $f : \mathbb{F}_p \to [0,1]$, and of course $f$ has a large spectral gap between $\lambda_3$ and $\lambda_4$
($|\lambda_3| = p/4$, while $\lambda_4 = 0$).
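The example in this remark can be made explicit by Fourier inversion. The sketch below is an illustration only: the names `inverse_fourier` and `remark_example` are hypothetical, and $p = 11$ is an arbitrary small prime. It recovers $f$ from the prescribed transform and compares it with the closed form $f(n) = 1/2 + (1/2)\cos(2\pi n/p)$, which follows by summing the three nonzero terms of the inversion formula.

```python
import cmath
import math

def inverse_fourier(fhat, p):
    """f(n) = p^{-1} sum_a fhat(a) e^{-2 pi i a n / p}."""
    return [
        sum(fhat[a] * cmath.exp(-2j * cmath.pi * a * n / p) for a in range(p)) / p
        for n in range(p)
    ]

def remark_example(p):
    """Function with hat f(0) = p/2, hat f(1) = hat f(-1) = p/4, zero elsewhere."""
    fhat = [0.0] * p
    fhat[0] = p / 2
    fhat[1] = p / 4
    fhat[p - 1] = p / 4  # the frequency -1 modulo p
    return inverse_fourier(fhat, p)

p = 11  # arbitrary small odd prime, chosen for illustration
values = remark_example(p)
closed_form = [0.5 + 0.5 * math.cos(2 * math.pi * n / p) for n in range(p)]
print(max(abs(values[n] - closed_form[n]) for n in range(p)))
```

Since the closed form takes values in $[0,1]$, this agrees with the claim that $f : \mathbb{F}_p \to [0,1]$.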
By considering repeated sums, one can prove similar results that hold for a
much wider range of $k$. Furthermore, one can derive conditions guaranteeing
that $(f * f * \cdots * f)(n) > 0$ for all $n \in \mathbb{F}_p$, not just for a $1 - \varepsilon$ proportion of $\mathbb{F}_p$;
and these conditions are much simpler and cleaner than those of Theorem 1 above.
This new theorem is as follows:

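Theorem 2 Suppose that $f : \mathbb{F}_p \to [0,1]$, $f$ not identically 0, has the property that
for some

$$1 \le k \le (\log p)^{t-1} (5t \log\log p)^{-2t+2},$$

we have that

$$|\lambda_{k+1}| \le \gamma |\lambda_k|, \quad \text{where} \quad \gamma < t^{-1} \theta^{-t+2} (|\lambda_k|/p)^{t-1}.$$

Then, for $t \ge 3$, the $t$-fold convolution $f * f * \cdots * f$ is positive on all of $\mathbb{F}_p$.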

Remark. It is possible to prove even stronger results when $k$ is much smaller
than $t$ (say, less than the square root of $t$), though the result is a little more technical
to state.

We conjecture that it is possible to prove a lot more: 

Conjecture. It is possible to develop bounds of the same general quality as
those in Theorem 1 for the number of $n$ with $(f * f)(n) > 0$, given that $f$ has a
large spectral gap between the $k$th and $(k+1)$st largest Fourier coefficients of $f$,
for any $k < p^{1/2}$, say. This would obviously require a different sort of proof than
appears in the present paper, as a key lemma we use (Lemma 2) is close to
best-possible. Furthermore, it should be possible to prove a version of Theorem 2 under
the assumption of such a spectral gap.

2 Some lemmas 

Lemma 1 (Dirichlet's Box Principle) Suppose that

$$r_1, \dots, r_t \in \mathbb{F}_p.$$

Then, there exists non-zero $m \in \mathbb{F}_p$ such that

$$\text{for } i = 1, \dots, t, \quad \left\| \frac{m r_i}{p} \right\| \le p^{-1/t},$$

where here $\|x\|$ denotes the distance from $x$ to the nearest integer.

The proof of this lemma is standard, so we omit it. The following lemma is
also standard, and was first discovered by Straus [4] (and rediscovered by the first
author), though we will give the proof. It is worth remarking that Browkin,
Diviš and Schinzel [1] have worked out a more general version of this lemma that
holds in arbitrary groups; and Lev [2] has extended and applied these results to
address some problems on discrepancy.

Lemma 2 (Unique Differences Lemma) Suppose that

$$B := \{b_1, \dots, b_t\} \subseteq \mathbb{F}_p.$$

Then, if

$$t < (\log p)/\log 4,$$

there will exist $d \in \mathbb{F}_p$ having a unique representation as a difference of two elements
of $B$.



Proof of the lemma. First, from the Dirichlet Box Principle above, we deduce
that there exists a non-zero dilation constant $m \in \mathbb{F}_p$ such that if we let

$$c_i \equiv m b_i \pmod{p}, \quad |c_i| < p/2,$$

then, in fact,

$$|c_i| \le p^{1-1/t}.$$

So long as

$$p^{1-1/t} < p/4 \iff p > 4^t,$$

we will have that all these $c_i$ lie in $(-p/4, p/4)$. Then, if we let

$$c_x := \min_i c_i, \quad \text{and} \quad c_y := \max_i c_i,$$

we claim that

$$d := c_y - c_x$$

has a unique representation as a difference of two of the $c_i$, namely $c_y - c_x$ itself.
Undoing the dilation, $m^{-1} d = b_y - b_x \in B - B$ then has a unique representation as a
difference of elements of $B$. The reason that the claim holds is that since $c_i \in
(-p/4, p/4)$ we have that all the differences

$$c_i - c_j \in (-p/2, p/2);$$

and so, two of these differences are equal if and only if they are equal modulo $p$; and,
it is clear that, over the integers, $d = c_y - c_x$ has a unique representation, implying
that it has a unique representation modulo $p$. ■
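The proof can be followed on a small example. The sketch below is an illustration under stated assumptions: it finds a dilation $m$ by brute-force search over all non-zero residues (a stand-in for the existence statement of Lemma 1, not the lemma's own argument), and the names `centered`, `best_dilation`, `unique_difference` and `representations`, as well as the sample set `B`, are hypothetical.

```python
def centered(x, p):
    """Representative of x modulo an odd prime p lying in (-p/2, p/2)."""
    x %= p
    return x - p if x > p // 2 else x

def best_dilation(B, p):
    """Brute-force search for the non-zero m minimising max_i |m * b_i mod p|."""
    best_m, best_size = None, None
    for m in range(1, p):
        size = max(abs(centered(m * b, p)) for b in B)
        if best_size is None or size < best_size:
            best_m, best_size = m, size
    return best_m, best_size

def unique_difference(B, p):
    """Return d in B - B built as in the proof: dilate, take max - min, undo."""
    m, size = best_dilation(B, p)
    if 4 * size >= p:
        raise ValueError("dilated set does not fit inside (-p/4, p/4)")
    c = [centered(m * b, p) for b in B]
    d_dilated = max(c) - min(c)
    m_inverse = pow(m, p - 2, p)  # inverse of m modulo the prime p (Fermat)
    return (d_dilated * m_inverse) % p

def representations(d, B, p):
    """All pairs (a, b) in B x B with a - b = d modulo p."""
    elements = sorted({b % p for b in B})
    return [(a, b) for a in elements for b in elements if (a - b - d) % p == 0]

p = 1009                # illustrative prime with 4**4 < p
B = [3, 77, 500, 901]   # illustrative set with t = 4 < (log p)/log 4
d = unique_difference(B, p)
print(d, representations(d, B, p))
```

The search over $m$ costs $O(pt)$ operations, which is fine for a demonstration but is not how one would argue in general.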

We will actually need a generalization of this lemma, which is a refinement of 
one appearing in [3], and is given as follows. 

Lemma 3 Suppose that

$$B_1, B_2 \subseteq \mathbb{F}_p,$$

where

$$10 \le |B_1| \le p/2, \quad \text{and} \quad 3|B_2| \log|B_1| > \log p.$$

Then, there exists $d \in B_1 - B_2$ having at most

$$20|B_2| (\log|B_1|)^2 / \log p$$

representations as

$$d = b - b', \quad b \in B_1, \ b' \in B_2.$$

Furthermore, if

$$1 \le |B_1| \le p/2, \quad \text{and} \quad 3|B_2| \log|B_1| \le \log p,$$

then there exists $d \in B_1 - B_2$ having a unique representation as $d = b_1 - b_2$, $b_1 \in
B_1$, $b_2 \in B_2$.

Proof of the lemma. Let $B'$ be a random subset of $B_2$, where each element $b \in B_2$
lies in $B'$ with probability

$$(\log p)/3|B_2| \log|B_1|.$$

Note that this is where our lower bound $3|B_2| \log|B_1| > \log p$ comes in, as we need
this to be less than 1.

So long as the $B'$ we choose satisfies

$$|B'| \le (\log p)/2\log|B_1|, \qquad (2)$$

which it will with probability at least 1/2, we claim that there will always exist
an element $d \in B_1 - B'$ having a unique representation as a difference $b_1 - b_2'$,
$b_1 \in B_1$, $b_2' \in B'$. First, note that it suffices to prove this for the set $C_1 - C'$, where

$$C_1 = m \cdot B_1, \quad C_2 = m \cdot B_2, \quad \text{and} \quad C' = m \cdot B',$$

where $m$ is a dilation constant chosen according to Dirichlet's Box Principle so that
every element $x \in C'$ (when considered as a subset of $(-p/2, p/2]$) satisfies

$$|x| \le p^{1-1/|B'|} < p/3|B_1|.$$

Now, there must exist an integer interval

$$I := (u, v) \cap \mathbb{Z}, \quad u, v \in C_1,$$

(which we consider as an interval modulo $p$) such that

$$|I| \ge p/|C_1| - 1 = p/|B_1| - 1,$$

and such that no element of $C_1$ is congruent modulo $p$ to an element of $I$. Clearly,
then, one of the following two elements

$$v - \max_{c' \in C'} c', \quad \text{or} \quad u - \min_{c' \in C'} c'$$

(here, each $c'$ is thought of as an element of $(-p/2, p/2]$) has a unique representation
as a difference. The reason we need this either-or is that all the elements of $C'$ could
be negative.

Now we define the functions

$$\nu(x) := |\{(c_1, c_2) \in C_1 \times C_2 : c_1 - c_2 = x\}|; \text{ and,}$$
$$\nu'(x) := |\{(c_1, c_2') \in C_1 \times C' : c_1 - c_2' = x\}|.$$

We claim that with probability at least 1/2 we will have that

$$\text{for every } x \in \mathbb{F}_p, \quad \nu(x) > 20|B_2| (\log|B_1|)^2/\log p \implies \nu'(x) \ge 2. \qquad (3)$$

To see this, fix $x \in C_1 - C_2$. Then, $\nu'(x)$ is the following sum of independent
Bernoulli random variables:

$$\nu'(x) = \sum_{j=1}^{\nu(x)} X_j, \quad \text{where} \quad \mathrm{Prob}(X_j = 1) = (\log p)/3|B_2| \log|B_1|.$$

The variance of $\nu'(x)$ is

$$\sigma^2 = \nu(x) \mathrm{Var}(X_1) \le \nu(x) \mathbb{E}(X_1).$$

We will now need the following well-known theorem of Chernoff:

Theorem 3 (Chernoff's inequality) Suppose that $Z_1, \dots, Z_n$ are independent
random variables such that $\mathbb{E}(Z_i) = 0$ and $|Z_i| \le 1$ for all $i$. Let $Z := \sum_i Z_i$, and let $\sigma^2$
be the variance of $Z$. Then,

$$\mathrm{Prob}(|Z| \ge \delta\sigma) \le 2e^{-\delta^2/4}, \quad \text{for any } 0 \le \delta \le 2\sigma.$$

We apply this theorem using $Z_i = X_i - \mathbb{E}(X_i)$ and

$$\delta\sigma = \nu(x) \mathbb{E}(X_1) - 1,$$

and then quickly deduce that if $\nu(x) > 20|B_2| (\log|B_1|)^2/\log p$, then

$$\mathrm{Prob}(\nu'(x) \le 1) \le 2\exp(-\delta^2/4) < 1/2|B_1|,$$

for $p$ sufficiently large. Clearly, then, with probability at least 1/2 we will have that
(3) holds for all $x$, as claimed. But we also had that (2) holds with probability
at least 1/2; so, there is an instantiation of the set $B'$ such that both (3) and (2)
hold. Since we proved that such $B'$ has the property that there is an element
$x \in B_1 - B'$ having $\nu'(x) = 1$, it follows from (3) that $\nu(x) \le 20|B_2| (\log|B_1|)^2/\log p$,
which proves the first part of our lemma.

Now we prove the second part of the lemma: First, the lemma is obviously true
in the case $|B_1| = 1$, so we assume that $|B_1| \ge 2$. Since we are also assuming that
$|B_2| \le \log p/3\log|B_1|$, we have by the Dirichlet Box Principle that there exists $m$ such
that for every $x \in C_2 := m \cdot B_2$ we have $|x| \le p/|B_1|^3$; furthermore, by the pigeonhole
principle there exists an integer interval $I := (u, v) \cap \mathbb{Z}$ with $u, v \in C_1 := m \cdot B_1$,
with $|I| \ge p/|B_1| - 1$, which contains no elements of $C_1$. So, either

$$v - \max_{x \in C_2} x \quad \text{or} \quad u - \min_{x \in C_2} x$$

has a unique representation as a difference $c_1 - c_2$, $c_1 \in C_1$, $c_2 \in C_2$. The same
holds for $B_1 - B_2$, and so our lemma is proved.
3 Proof of Theorem 1

We apply Lemma 2 with

$$B = A = \{a_1, \dots, a_k\}, \quad \text{so } t = k.$$

Then, let $d$ be as in the lemma, and let

$$a_x, a_y \in A$$

satisfy

$$a_y - a_x = d.$$

We define

$$g(n) := e^{2\pi i d n/p} f(n),$$

and note that

$$(f * f)(n) \ge |(g * f)(n)|.$$

So, our theorem is proved if we can show that $(g * f)(n)$ is often non-zero. Proceeding
in this vein, let us compute the Fourier transform of $g * f$: First, we have that

$$\hat{g}(a) = \sum_n g(n) e^{2\pi i a n/p} = \sum_n f(n) e^{2\pi i (a+d) n/p} = \hat{f}(a + d).$$

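So, by Fourier inversion,

$$(f * g)(n) = p^{-1} e^{-2\pi i a_x n/p} \hat{f}(a_x) \hat{f}(a_y) + E(n), \qquad (4)$$

where $E(n)$ is the "error" given by

$$E(n) = p^{-1} \sum_{a \ne a_x} e^{-2\pi i a n/p} \hat{f}(a) \hat{f}(a + d).$$

Note that for every $a = a_i$ with $i \ne x$ we have that

$$\text{either } a \text{ or } a + d \text{ lies in } \{a_{k+1}, \dots, a_p\},$$

because $d$ has a unique representation as a difference of two elements of $A$; and so

$$|\hat{f}(a) \hat{f}(a + d)| \le \gamma |\lambda_k| \max\{|\hat{f}(a)|, |\hat{f}(a + d)|\}. \qquad (5)$$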

To finish our proof we must show that "most of the time" $|E(n)|$ is smaller than
the "main term" of (4); that is,

$$|E(n)| < p^{-1} |\hat{f}(a_x) \hat{f}(a_y)|.$$

Note that this holds whenever

$$|E(n)| < p^{-1} |\lambda_k|^2. \qquad (6)$$

We have by Parseval and (5) that

$$\sum_n |E(n)|^2 = p^{-1} \sum_{i \ne x} |\hat{f}(a_i)|^2 |\hat{f}(a_i + d)|^2 \le 2p^{-1} \gamma^2 |\lambda_k|^2 \sum_a |\hat{f}(a)|^2 \le 2\gamma^2 |\lambda_k|^2 \hat{f}(0).$$

So, the number of $n$ for which (6) holds is at least

$$p\left(1 - 2\gamma^2 |\lambda_k|^{-2} \hat{f}(0) p\right) = p\left(1 - 2p^2 \theta \gamma^2 |\lambda_k|^{-2}\right),$$

as claimed.
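As a sanity check, the lower bound reached at the end of this proof, read here as $p(1 - 2p^2\theta\gamma^2|\lambda_k|^{-2})$, can be compared with a direct count in the simplest case $k = 1$. The sketch below is illustrative only: the function names are hypothetical, $\gamma$ is taken to be the exact ratio $|\lambda_{k+1}|/|\lambda_k|$, and the indicator of the nonzero squares modulo 103 is an arbitrary test input.

```python
import cmath

def fourier_transform(f, p):
    """hat f(a) = sum_n f(n) e^{2 pi i a n / p} for a = 0, ..., p-1."""
    return [
        sum(f[n] * cmath.exp(2j * cmath.pi * a * n / p) for n in range(p))
        for a in range(p)
    ]

def theorem1_lower_bound(f, p, k):
    """p * (1 - 2 p^2 theta gamma^2 / |lambda_k|^2) with gamma = |lambda_{k+1}| / |lambda_k|."""
    magnitudes = sorted((abs(z) for z in fourier_transform(f, p)), reverse=True)
    lam_k, lam_k_plus_1 = magnitudes[k - 1], magnitudes[k]
    gamma = lam_k_plus_1 / lam_k
    theta = sum(f) / p
    return p * (1 - 2 * p ** 2 * theta * gamma ** 2 / lam_k ** 2)

def positive_support_size(f, p):
    """Number of n in F_p with (f * f)(n) > 0, by direct summation."""
    count = 0
    for n in range(p):
        if sum(f[m] * f[(n - m) % p] for m in range(p)) > 0:
            count += 1
    return count

p = 103  # illustrative prime
squares = {(x * x) % p for x in range(1, p)}
f = [1 if n in squares else 0 for n in range(p)]

bound = theorem1_lower_bound(f, p, k=1)
covered = positive_support_size(f, p)
print(covered, bound)
```

For an indicator function, `covered` is the size of the sumset $S + S$, so the comparison is between $|S + S|$ and the bound.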
4 Proof of Theorem 2

Let

$$B_1 := B_2 := A = \{a_1, \dots, a_k\}.$$

Suppose initially that $3|A| \log|A| > \log p$, so that the hypotheses of the first part
of Lemma 3 hold. We have then that there exists $d_1 \in B_1 - B_2 = A - A$ with at most
$20|A| (\log|A|)^2/\log p$ representations as $d_1 = a - b$, $a, b \in A$. Let now $A_1$ denote the
set of all the elements $b$ that occur. Clearly,

$$|A_1| \le 20|A| (\log|A|)^2/\log p.$$

Keeping $B_1 = A$, we reassign $B_2 = A_1$. So long as $3|A_1| \log|A| > \log p$ we
may apply the first part of Lemma 3, and when we do we deduce that there exists
$d_2 \in A - A_1$ having at most $20|A_1| (\log|A|)^2/\log p$ representations as $d_2 = a - b$,
$a \in A$, $b \in A_1$. Let now $A_2$ denote the set of all elements $b$ that occur. Clearly

$$|A_2| \le 20|A_1| (\log|A|)^2/\log p.$$

We repeat this process, reassigning $B_2 = A_2$, then $B_2 = A_3$, and so on, all the
while producing these sets $A_1, A_2, \dots$ and differences $d_1, d_2, \dots$, until we reach a set
$A_m$ satisfying

$$3|A_m| \log|A| \le \log p.$$

We may, in fact, reach this set $A_m$ with $m = 1$ if $3|A| \log|A| \le \log p$.
It is clear that since at each step we have

$$|A_i| \le 20|A_{i-1}| (\log|A|)^2/\log p,$$

and since we have assumed that

$$|A| \le (\log p)^{t-1} (5t \log\log p)^{-2t+2},$$

we will reach such a set with $m$ of size at most

$$m \le t - 1.$$

This set $A_m$ will have the property, by the second part of Lemma 3, that there
exists $d_m \in A - A_m$ having a unique representation as $d_m = a - b$, $a \in A$, $b \in A_m$.
Now, we claim that there exists unique $b \in \mathbb{F}_p$ such that

$$b, \ b + d_1, \ b + d_2, \ \dots, \ b + d_m \in A.$$

To see this, first let $b \in A$. Since $b + d_1 \in A$ we must have that $b \in A_1$, by definition
of $A_1$. Then, since $b + d_2 \in A$, it follows that $b \in A_2$. And, repeating this process,
we eventually conclude that $b \in A_m$.

So, since $b \in A_m$ and $b + d_m \in A$, we have $d_m = a - b$, $a \in A$, $b \in A_m$. But this
$d_m$ was chosen by the second part of Lemma 3 so that it has a unique representation
of this form. It follows that $b \in A$ is unique, as claimed.
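The shrinking chain $A \supseteq A_1 \supseteq A_2 \supseteq \cdots$ can be imitated on a small example. The sketch below is not the construction of Lemma 3: as a simplifying assumption it picks each $d_i$ greedily, as a difference with the fewest representations, and stops once a difference with exactly one representation is found. The names `difference_chain` and `common_base_points`, and the sample set `A`, are hypothetical.

```python
from collections import Counter

def difference_chain(A, p):
    """Greedy stand-in for the iteration: return (d_1..d_m, final one-element set)."""
    A = sorted({a % p for a in A})
    members = set(A)
    current = list(A)  # plays the role of B_2 = A, A_1, A_2, ...
    differences = []
    while True:
        # Count representations d = a - b with a in A and b in the current set.
        counts = Counter((a - b) % p for a in A for b in current)
        d, reps = min(counts.items(), key=lambda item: (item[1], item[0]))
        differences.append(d)
        # Next set: the elements b that occur in a representation of d.
        shrunk = [b for b in current if (b + d) % p in members]
        if reps == 1:
            return differences, shrunk
        if len(shrunk) >= len(current):
            raise RuntimeError("no progress; the set A is too large for this sketch")
        current = shrunk

def common_base_points(A, differences, p):
    """All b with b, b + d_1, ..., b + d_m in A."""
    members = {a % p for a in A}
    return [
        b for b in range(p)
        if b in members and all((b + d) % p in members for d in differences)
    ]

p = 101                               # illustrative prime
A = [2, 5, 11, 17, 40, 41, 77, 90]    # illustrative set
ds, last = difference_chain(A, p)
bases = common_base_points(A, ds, p)
print(ds, last, bases)
```

The unique base point found by exhaustive search coincides with the single element left at the end of the chain, mirroring the uniqueness claim above.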
From our function $f : \mathbb{F}_p \to [0,1]$, we define the functions $g_1, g_2, \dots, g_m : \mathbb{F}_p \to \mathbb{C}$
via

$$g_i(n) := e^{2\pi i d_i n/p} f(n).$$

It is obvious that

$$\mathrm{support}(f * f * \cdots * f * g_1 * g_2 * \cdots * g_m) \subseteq \mathrm{support}(f * f * \cdots * f),$$

where there are $t$ convolutions on the left, and $t$ on the right; so, $f$ appears $t - m$
times on the left.
We also have that

$$\hat{g}_i(a) = \hat{f}(a + d_i),$$

and therefore

$$\widehat{(f * f * \cdots * f * g_1 * \cdots * g_m)}(a) = \hat{f}(a)^{t-m} \hat{f}(a + d_1) \hat{f}(a + d_2) \cdots \hat{f}(a + d_m).$$

Since there exists unique $a$, call it $x$, such that all these $a + d_i$ belong to $A$, we
deduce via Fourier inversion that for any $n \in \mathbb{F}_p$,

$$(f * f * \cdots * g_1 * \cdots * g_m)(n) = p^{-1} e^{-2\pi i n x/p} \hat{f}(x)^{t-m} \hat{f}(x + d_1) \cdots \hat{f}(x + d_m) + E(n),$$

where the "error" $E(n)$ satisfies, by the usual $L^2$-$L^\infty$ bound,

$$|E(n)| \le t |\lambda_{k+1}| (\theta^{t-3} p^{t-4}) \sum_a |\hat{f}(a)|^2 \le t \gamma (\theta p)^{t-2} |\lambda_k|.$$

So, whenever this is smaller than the main term, we have that the convolution is
non-zero, and therefore so is $(f * f * \cdots * f)(n)$. This occurs if

$$t \gamma (\theta p)^{t-2} |\lambda_k| < p^{-1} |\lambda_k|^t,$$

which holds whenever $t > 2$ and

$$\gamma < t^{-1} \theta^{-t+2} (|\lambda_k|/p)^{t-1}.$$